Polyphonic transcription by non-negative sparse coding of power spectra

Authors

  • Samer M. Abdallah
  • Mark D. Plumbley
Abstract

We present a system for adaptive spectral basis decomposition that learns to identify independent spectral features given a sequence of short-term Fourier spectra. When applied to recordings of polyphonic piano music, the individual notes are identified as salient features, and hence each short-term spectrum is decomposed into a sum of note spectra; the resulting encoding can be used as a basis for polyphonic transcription. The system is based on a probabilistic model equivalent to a form of noisy independent component analysis (ICA) or sparse coding with non-negativity constraints. We introduce a novel modification to this model that recognises that a short-term Fourier spectrum can be thought of as a noisy realisation of the power spectral density of an underlying Gaussian process, where the noise is essentially multiplicative and non-Gaussian. Results are presented for an analysis of a live recording of polyphonic piano music.
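The abstract does not give the learning rules, so the following is not the authors' algorithm. It is a minimal sketch of a related, swapped-in technique: non-negative matrix factorisation under the Itakura-Saito divergence, whose cost corresponds to maximum likelihood when each power-spectrum bin is the model spectrum times exponentially distributed (multiplicative) noise. The L1 penalty on the activations, the column normalisation, the function names `sparse_is_nmf` and `is_divergence`, and the synthetic two-note data are all assumptions made for illustration.

```python
import numpy as np

def is_divergence(V, V_hat, eps=1e-9):
    """Mean Itakura-Saito divergence between data V and model V_hat."""
    R = (V + eps) / (V_hat + eps)
    return float(np.mean(R - np.log(R) - 1.0))

def sparse_is_nmf(V, n_components, n_iter=200, sparsity=0.1, seed=0, eps=1e-9):
    """Heuristic multiplicative updates for V ~ W @ H with W, H >= 0.

    V        : (n_bins, n_frames) power spectrogram, strictly positive
    W        : (n_bins, n_components) non-negative basis spectra
    H        : (n_components, n_frames) non-negative, sparse activations
    sparsity : weight of an L1 penalty on H (an assumption of this sketch)
    """
    rng = np.random.default_rng(seed)
    n_bins, n_frames = V.shape
    W = rng.random((n_bins, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    for _ in range(n_iter):
        # Update activations; the sparsity term enlarges the denominator.
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat) + sparsity)
        # Update basis spectra.
        V_hat = W @ H + eps
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T + eps)
        # Normalise each basis spectrum to unit sum and rescale H so that
        # the product W @ H is unchanged.
        scale = W.sum(axis=0, keepdims=True) + eps
        W /= scale
        H *= scale.T
    return W, H

# Synthetic stand-in for a piano recording: two harmonic "note" spectra
# switched on and off over time, observed through multiplicative noise.
rng = np.random.default_rng(1)
n_bins, n_frames = 64, 200
bins = np.arange(n_bins)

def note(f0):
    # Four harmonics of f0 with 1/k amplitude decay (illustrative only).
    return sum(np.exp(-0.5 * (bins - k * f0) ** 2) / k for k in range(1, 5))

W_true = np.stack([note(7), note(11)], axis=1)
H_true = (rng.random((2, n_frames)) < 0.3) * rng.uniform(0.5, 2.0, (2, n_frames))
psd = W_true @ H_true + 0.01                       # underlying power spectral density
V = psd * rng.exponential(1.0, size=psd.shape)     # noisy short-term power spectra

W, H = sparse_is_nmf(V, n_components=3)
```

In this sketch the columns of `W` play the role of learned note spectra and the rows of `H` their per-frame activations; thresholding `H` would be one simple way to turn the encoding into note events, though the abstract does not say how the authors do this.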

Similar resources

Sound Source Separation Using Sparse Coding with Temporal Continuity Objective

A data-adaptive sound source separation system is presented, which is able to extract meaningful sources from polyphonic real-world music signals. The system is based on the assumption of non-negative sparse sources which have constant spectra with time-varying gain. A temporal continuity objective is proposed as an improvement to the existing techniques. The objective increases the robustness of...

Sparse Non-negative Matrix Factor 2-D Deconvolution for Automatic Transcription of Polyphonic Music

We present a novel method for automatic transcription of polyphonic music based on a recently published algorithm for non-negative matrix factor 2-D deconvolution. The method works by simultaneously estimating a time-frequency model for an instrument and a pattern corresponding to the notes which are played based on a log-frequency spectrogram of the music.

Sparse representations of polyphonic music

We consider two approaches for sparse decomposition of polyphonic music: a time-domain approach based on shift-invariant waveforms, and a frequency-domain approach based on phase-invariant power spectra. When trained on an example of a MIDI-controlled acoustic piano recording, both methods produce dictionary vectors or sets of vectors which represent underlying notes, and produce component activ...

Drum Transcription in Polyphonic Music Using Non-Negative Matrix Factorisation

We present a system that is based on the non-negative matrix factorisation (NMF) algorithm and is able to transcribe drum onset events in polyphonic music. The magnitude spectrogram representation of the input music is divided by the NMF algorithm into source spectra and corresponding time-varying gains. Each of these source components is classified as a drum instrument or non-drum sound and a ...

Transcribing Bach Chorales: Limitations and Potentials of Non-Negative Matrix Factorisation

This article discusses our research on polyphonic music transcription using non-negative matrix factorisation (NMF). The application of NMF in polyphonic transcription offers an alternative approach in which observed frequency spectra from polyphonic audio could be seen as an aggregation of spectra from monophonic components. However, it is not easy to find accurate aggregations using a standar...

Publication date: 2004